5. Derivatives and gradients using ForwardDiff

This example is also available as a Jupyter notebook that can be run locally The notebook can be found in the examples directory of the package. If the notebooks are missing, you may need to run using Pkg; Pkg.build().

Setup

using RigidBodyDynamics, StaticArrays, ForwardDiff
using Test, Random
Random.seed!(1); # to get repeatable results
 Activating environment at `~/work/RigidBodyDynamics.jl/RigidBodyDynamics.jl/examples/5. Derivatives and gradients using ForwardDiff/Project.toml`

Jacobians with respect to $q$ and $v$ - the naive way

First, we'll load our trusty double pendulum from a URDF:

srcdir = dirname(pathof(RigidBodyDynamics))
urdf = joinpath(srcdir, "..", "test", "urdf", "Acrobot.urdf")
mechanism = parse_urdf(urdf)
Spanning tree:
Vertex: world (root)
  Vertex: upper_link, Edge: shoulder
    Vertex: lower_link, Edge: elbow
No non-tree joints.

Of course, we can create a MechanismState for the double pendulum, and compute its momentum in some random state:

float64state = MechanismState(mechanism)
rand!(float64state)
momentum(float64state)
Momentum expressed in "world":
angular: [0.0801508429099656, 1.4308383504250024, 0.15951325559917823], linear: [-0.6978510021979175, 0.0, 0.33892351174664365]

But now suppose we want the Jacobian of momentum with respect to the joint velocity vector $v$. We can do this using the ForwardDiff.Dual type and the ForwardDiff.jacobian function. The ForwardDiff package implements forward-mode automatic differentiation.

To use ForwardDiff.jacobian we'll create a function that maps v (as a Vector) to momentum (as a Vector):

q = configuration(float64state)
function momentum_vec(v::AbstractVector{T}) where T
    # create a `MechanismState` that can handle the element type of `v` (which will be some `ForwardDiff.Dual`):
    state = MechanismState{T}(mechanism)

    # set the state variables:
    set_configuration!(state, q)
    set_velocity!(state, v)

    # return momentum converted to an `SVector` (as ForwardDiff expects an `AbstractVector`)
    Vector(SVector(momentum(state)))
end
momentum_vec (generic function with 1 method)

Let's first check that the function returns the same thing we got from float64state:

v = velocity(float64state)
@test momentum_vec(v) == SVector(momentum(float64state))
Test Passed

That works, so now let's compute the Jacobian with ForwardDiff:

J = ForwardDiff.jacobian(momentum_vec, v)
6×2 Array{Float64,2}:
  0.252338   0.157137
  4.51855    2.25777
  0.505187   0.194443
 -2.21197   -0.777771
  0.0        0.0
  1.06794    0.628547

At this point we note that the matrix J is simply the momentum matrix (in world frame) of the Mechanism. In this case, RigidBodyDynamics.jl has a specialized algorithm for computing this matrix, so let's verify the results:

A = momentum_matrix(float64state)
@test J ≈ Array(A) atol = 1e-12
Test Passed

Gradients with respect to $q$ can be computed in similar fashion.

Improving performance

Ignoring the fact that we have a specialized method available, let's look at the performance of using ForwardDiff.jacobian.

using BenchmarkTools
@benchmark ForwardDiff.jacobian($momentum_vec, $v)
BenchmarkTools.Trial: 
  memory estimate:  39.25 KiB
  allocs estimate:  417
  --------------
  minimum time:     36.301 μs (0.00% GC)
  median time:      54.501 μs (0.00% GC)
  mean time:        63.861 μs (12.01% GC)
  maximum time:     19.348 ms (99.46% GC)
  --------------
  samples:          10000
  evals/sample:     1

That's not great. Note all the allocations. We can do better by making the following modifications:

  1. use an in-place version of the jacobian function, ForwardDiff.jacobian!
  2. reimplement our momentum_vec function to be in-place as well
  3. don't create a new MechanismState every time

The third point is especially important; creating a MechanismState is expensive!

Regarding the second point, we could also just stop converting momentum from a StaticArrays.SVector to a Vector to avoid allocations. However, the solution of making the function in-place also applies when the size of the output vector is not known statically (e.g., for dynamics_bias!).

To facillitate reuse of MechanismStates while keeping the code nice and generic, we can use a StateCache object. StateCache is a container that stores MechanismStates of various types (associated with one Mechanism), and will ease the process of using ForwardDiff. Creating one is easy:

const statecache = StateCache(mechanism)
StateCache{…}(…)

MechanismStates of a given type can be accessed as follows (note that if a MechanismState of a certain type is already available, it will be reused):

float32state = statecache[Float32]
@test float32state === statecache[Float32]
Test Passed

Now we'll use the StateCache to reimplement momentum_vec, making it in-place as well:

function momentum_vec!(out::AbstractVector, v::AbstractVector{T}) where T
    # retrieve a `MechanismState` that can handle the element type of `v`:
    state = statecache[T]

    # set the state variables:
    set_configuration!(state, q)
    set_velocity!(state, v)

    # compute momentum and store it in `out`
    m = momentum(state)
    copyto!(out, SVector(m))
end
momentum_vec! (generic function with 1 method)

Check that the in-place version works as expected on Float64 inputs:

const out = zeros(6) # where we'll be storing our results
momentum_vec!(out, v)
@test out == SVector(momentum(float64state))
Test Passed

And use ForwardDiff.jacobian! to compute the Jacobian:

const result = DiffResults.JacobianResult(out, v)
const config = ForwardDiff.JacobianConfig(momentum_vec!, out, v)
ForwardDiff.jacobian!(result, momentum_vec!, out, v, config)
J = DiffResults.jacobian(result)
@test J ≈ Array(A) atol = 1e-12
Test Passed

Let's check the performance again:

@benchmark ForwardDiff.jacobian!($result, $momentum_vec!, $out, $v, $config)
BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     856.938 ns (0.00% GC)
  median time:      1.114 μs (0.00% GC)
  mean time:        1.138 μs (0.00% GC)
  maximum time:     2.165 μs (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     65

That's much better. Do note that the specialized algorithm is still faster:

q = copy(configuration(float64state))
@benchmark begin
    set_configuration!($float64state, $q)
    momentum_matrix!($A, $float64state)
end
BenchmarkTools.Trial: 
  memory estimate:  0 bytes
  allocs estimate:  0
  --------------
  minimum time:     314.230 ns (0.00% GC)
  median time:      422.180 ns (0.00% GC)
  mean time:        423.727 ns (0.00% GC)
  maximum time:     700.849 ns (0.00% GC)
  --------------
  samples:          10000
  evals/sample:     239

Time derivatives

We can also use ForwardDiff to compute time derivatives. Let's verify that energy is conserved for the double pendulum in the absence of nonconservative forces (like damping). That is, we expect that the time derivative of the pendulum's total energy is zero when its state evolves according to the passive dynamics.

Let's first compute the joint acceleration vector $\dot{v}$ using the passive dynamics:

dynamicsresult = DynamicsResult(mechanism)
set_configuration!(float64state, q)
set_velocity!(float64state, v)
dynamics!(dynamicsresult, float64state)
v̇ = dynamicsresult.v̇
2-element SegmentedVector{JointID,Float64,Base.OneTo{JointID},Array{Float64,1}}:
  0.07968076794580498
 -4.798825359977608

Now for the time derivative of total energy. ForwardDiff has a derivative function that can be used to take derivatives of functions that map a scalar to a scalar. But in this example, we'll instead use ForwardDiff's Dual type directly. ForwardDiff.Dual represents a (potentially multidimensional) dual number, i.e., a type that stores both the value of a function evaluated at a certain point, as well as the partial derivatives of the function, again evaluated at the same point. See the ForwardDiff documentation for more information.

We'll create a vector of Duals representing the value and derivative of $q(t)$:

q̇ = v
q_dual = ForwardDiff.Dual.(q, q̇)
2-element Array{ForwardDiff.Dual{Nothing,Float64,1},1}:
 Dual{Nothing}(0.2972879845354616,0.3127069683360675)
 Dual{Nothing}(0.3823959677906078,0.00790928339056074)

Note: for the double pendulum, $\dot{q} = v$, but this is not the case in general for Mechanisms created using RigidBodyDynamics.jl. For example, the QuaternionSpherical joint type uses a unit quaternion to represent the joint configuration, but angular velocity (in body frame) to represent velocity. In general $\dot{q}$ can be computed from the velocity vector $v$ stored in a MechanismState using

configuration_derivative(::MechanismState)

or its in-place variant, configuration_derivative!.

We'll do the same thing for $v(t)$:

v_dual = ForwardDiff.Dual.(v, v̇)
2-element Array{ForwardDiff.Dual{Nothing,Float64,1},1}:
 Dual{Nothing}(0.3127069683360675,0.07968076794580498)
 Dual{Nothing}(0.00790928339056074,-4.798825359977608)

Now we're ready to compute the total energy (kinetic + potential) using these ForwardDiff.Dual inputs. We'll use our StateCache again:

T = eltype(q_dual)
state = statecache[T]
set_configuration!(state, q_dual)
set_velocity!(state, v_dual)
energy_dual = kinetic_energy(state) + gravitational_potential_energy(state)
Dual{Nothing}(-21.472905435008563,0.0)

Note that the result type of energy_dual is again a ForwardDiff.Dual. We can extract the energy and its time derivative (mechanical power) from energy_dual as follows:

energy = ForwardDiff.value(energy_dual)
partials = ForwardDiff.partials(energy_dual)
power = partials[1];

So the total energy in the system is:

energy
-21.472905435008563

Note: the total energy is negative because the origin of the world frame is used as a reference for computing gravitational potential energy, i.e., the center of mass of the double pendulum is somewhere below this origin.

And we can verify that, indeed, there is no power flow into or out of the system:

@test power ≈ 0 atol = 1e-14
Test Passed

This page was generated using Literate.jl.